21 research outputs found

    Minimax Estimation of Kernel Mean Embeddings

    Full text link
    In this paper, we study the minimax estimation of the Bochner integral μk(P):=Xk(,x)dP(x),\mu_k(P):=\int_{\mathcal{X}} k(\cdot,x)\,dP(x), also called as the kernel mean embedding, based on random samples drawn i.i.d.~from PP, where k:X×XRk:\mathcal{X}\times\mathcal{X}\rightarrow\mathbb{R} is a positive definite kernel. Various estimators (including the empirical estimator), θ^n\hat{\theta}_n of μk(P)\mu_k(P) are studied in the literature wherein all of them satisfy θ^nμk(P)Hk=OP(n1/2)\bigl\| \hat{\theta}_n-\mu_k(P)\bigr\|_{\mathcal{H}_k}=O_P(n^{-1/2}) with Hk\mathcal{H}_k being the reproducing kernel Hilbert space induced by kk. The main contribution of the paper is in showing that the above mentioned rate of n1/2n^{-1/2} is minimax in Hk\|\cdot\|_{\mathcal{H}_k} and L2(Rd)\|\cdot\|_{L^2(\mathbb{R}^d)}-norms over the class of discrete measures and the class of measures that has an infinitely differentiable density, with kk being a continuous translation-invariant kernel on Rd\mathbb{R}^d. The interesting aspect of this result is that the minimax rate is independent of the smoothness of the kernel and the density of PP (if it exists). This result has practical consequences in statistical applications as the mean embedding has been widely employed in non-parametric hypothesis testing, density estimation, causal inference and feature selection, through its relation to energy distance (and distance covariance)

    PAC-Bayes-empirical-Bernstein inequality

    Get PDF
    We present PAC-Bayes-Empirical-Bernstein inequality. The inequality is based on combination of PAC-Bayesian bounding technique with Empirical Bernstein bound. It allows to take advantage of small empirical variance and is especially useful in regression. We show that when the empirical variance is significantly smaller than the empirical loss PAC-Bayes-Empirical-Bernstein inequality is significantly tighter than PAC-Bayes-kl inequality of Seeger (2002) and otherwise it is comparable. PAC-Bayes-Empirical-Bernstein inequality is an interesting example of application of PAC-Bayesian bounding technique to self-bounding functions. We provide empirical comparison of PAC-Bayes-Empirical-Bernstein inequality with PAC-Bayes-kl inequality on a synthetic example and several UCI datasets

    Towards a Learning Theory of Cause-Effect Inference

    Full text link
    We pose causal inference as the problem of learning to classify probability distributions. In particular, we assume access to a collection {(Si,li)}i=1n\{(S_i,l_i)\}_{i=1}^n, where each SiS_i is a sample drawn from the probability distribution of Xi×YiX_i \times Y_i, and lil_i is a binary label indicating whether "XiYiX_i \to Y_i" or "XiYiX_i \leftarrow Y_i". Given these data, we build a causal inference rule in two steps. First, we featurize each SiS_i using the kernel mean embedding associated with some characteristic kernel. Second, we train a binary classifier on such embeddings to distinguish between causal directions. We present generalization bounds showing the statistical consistency and learning rates of the proposed approach, and provide a simple implementation that achieves state-of-the-art cause-effect inference. Furthermore, we extend our ideas to infer causal relationships between more than two variables

    Differentially Private Database Release via Kernel Mean Embeddings

    Get PDF
    We lay theoretical foundations for new database release mechanisms that allow third-parties to construct consistent estimators of population statistics, while ensuring that the privacy of each individual contributing to the database protected. The proposed framework rests on two main ideas. First, releasing (an estimate of) the kernel mean embedding of the data generating random variable instead of the database itself still allows third-parties to construct consistent estimators of a wide class of population statistics. Second, the algorithm can satisfy the definition of differential privacy by basing the released kernel mean embedding on entirely synthetic data points, while controlling accuracy through the metric available in a Reproducing Kernel Hilbert Space. We describe two instantiations of the proposed framework, suitable under different scenarios, and prove theoretical results guaranteeing differential privacy of the resulting algorithms and the consistency of estimators constructed from their outputs

    Spatial Evolutionary Generative Adversarial Networks

    Full text link
    Generative adversary networks (GANs) suffer from training pathologies such as instability and mode collapse. These pathologies mainly arise from a lack of diversity in their adversarial interactions. Evolutionary generative adversarial networks apply the principles of evolutionary computation to mitigate these problems. We hybridize two of these approaches that promote training diversity. One, E-GAN, at each batch, injects mutation diversity by training the (replicated) generator with three independent objective functions then selecting the resulting best performing generator for the next batch. The other, Lipizzaner, injects population diversity by training a two-dimensional grid of GANs with a distributed evolutionary algorithm that includes neighbor exchanges of additional training adversaries, performance based selection and population-based hyper-parameter tuning. We propose to combine mutation and population approaches to diversity improvement. We contribute a superior evolutionary GANs training method, Mustangs, that eliminates the single loss function used across Lipizzaner's grid. Instead, each training round, a loss function is selected with equal probability, from among the three E-GAN uses. Experimental analyses on standard benchmarks, MNIST and CelebA, demonstrate that Mustangs provides a statistically faster training method resulting in more accurate networks
    corecore